정보과학회 컴퓨팅의 실제 논문지 (KIISE Transactions on Computing Practices)

Korean Title: ReLU가 합성된 행렬 곱 연산의 부분 생략을 통한 딥 러닝 모델 추론 시간 개선
English Title: Improving the Inference Time of the Deep Learning Model with Partial Skip of ReLU-fused Matrix Multiplication Operations
Authors: Sungkyun Kim, Gunjoo Ahn, Nahum Kim, Jiwon Seo
Citation: Vol. 28, No. 3, pp. 139-145, March 2022
Korean Abstract:
Deep learning is being applied to an increasingly wide range of fields, and large-scale deep learning models with many parameters tend to show good performance. Inference with such large models inevitably requires substantial resources and long execution times, so shortening inference time is essential for the efficient use of deep learning models. In this paper, we fuse the activation function Rectified Linear Unit with matrix multiplication in the deep learning inference process and propose four methods that reduce the amount of computation by predicting in advance the sign of the output values to be produced by the fused operation; by comparing these four computation-skipping methods, we derive the optimal scheme that reduces computation, and thus inference time, with almost no loss of accuracy.
English Abstract:
Deep learning is being used in an ever wider range of applications, and large-scale deep learning models containing many parameters tend to perform well. Because large-scale models inevitably require substantial resources and long inference times, reducing the inference time is essential for the efficient use of deep learning models. We fuse the activation function Rectified Linear Unit with matrix multiplication in the inference process and reduce the amount of computation by predicting the sign of the output values before they are fully computed. We propose four methods for reducing the computation and, by comparing them, derive an optimal method that saves inference time with little loss of accuracy.
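The abstract describes fusing ReLU with the preceding matrix multiplication and skipping part of each output's dot product once its sign can be predicted (ReLU zeroes negative outputs, so their remaining multiply-adds are wasted work). The minimal NumPy sketch below illustrates that idea; the partial-sum sign predictor and the partial_frac parameter are illustrative assumptions for this example, not any of the paper's four proposed methods.

    import numpy as np

    def relu_fused_matmul_partial_skip(W, x, b, partial_frac=0.25):
        # Compute ReLU(W @ x + b), skipping the rest of a row's dot product
        # when an early partial sum predicts the ReLU output will be zero.
        n_out, n_in = W.shape
        k = max(1, int(n_in * partial_frac))   # terms used to predict the sign
        y = np.zeros(n_out, dtype=W.dtype)
        for i in range(n_out):
            partial = W[i, :k] @ x[:k] + b[i]
            if partial < 0.0:                  # predicted negative: ReLU would zero it
                continue                       # skip the remaining n_in - k multiply-adds
            y[i] = max(0.0, partial + W[i, k:] @ x[k:])
        return y

    rng = np.random.default_rng(0)
    W = rng.standard_normal((128, 512))
    x = rng.standard_normal(512)
    b = rng.standard_normal(128)
    exact = np.maximum(0.0, W @ x + b)
    approx = relu_fused_matmul_partial_skip(W, x, b)
    print("outputs whose sign the heuristic mispredicted:",
          int(np.sum((exact > 0) != (approx > 0))))

A skip like this only pays off if the sign prediction is cheap relative to the work it avoids and mispredictions (outputs zeroed that should have been positive) remain rare; balancing that saving against accuracy loss is what the paper's comparison of its four skipping methods addresses.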
Keywords: deep learning optimization, omitted computation, fully-connected layer, inference optimization
File Attachment: PDF download